COMPOUND POISSON APPROXIMATION FOR TRIANGULAR 
ARRAYS WITH APPLICATION TO THRESHOLD ESTIMATION 



P. CHIGANSKY AND F.C. KLEBANER 

Abstract. We prove weak convergence of triangular arrays to the compound Poisson 
limit using Tihomirov's method. The result is applied to statistical estimation of the 
threshold parameter in autoregressive models. 



1. Introduction and main result 

This paper is concerned with weak convergence of sums over triangular arrays with a
certain dependence structure to the compound Poisson distribution. It is motivated by the
threshold estimation problem, described in detail in Section 2. We consider triangular
arrays of random variables $Y_{n,j}$, $j=1,\dots,n$, $n\in\mathbb{N}$, with rows adapted to a filtration
$(\mathcal{F}_j)$, $j\in\mathbb{N}$. The $Y_{n,j}$'s are asymptotically negligible and satisfy a weak dependence (mixing)
condition made precise by the following assumptions.



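(A1) there is a constant $C_1>0$, such that

$$\mathbb{E}|Y_{n,i}|\mathbf{1}_{\{Y_{n,j}\ne 0\}}\le \frac{C_1^2}{n^2},\quad i\ne j,$$

and

$$\mathbb{P}(Y_{n,j}\ne 0)\le \frac{C_1}{n},\quad\text{and}\quad \mathbb{E}|Y_{n,j}|\le \frac{C_1}{n},\quad j=1,\dots,n$$

(A2) there is an integer $\ell\ge 1$, such that

$$\mathbb{E}\big|\mathbb{E}(Y_{n,j}|\mathcal{F}_i)-\mathbb{E}Y_{n,j}\big|\le \frac{C_1}{n}\alpha(j-i),\quad i\le j-\ell,$$

where $\alpha(n)>0$ is a decreasing sequence with $\lim_{n\to\infty}\alpha(n)=0$

(A3) for a measurable function $|v(x)|\le 1$, $x\in\mathbb{R}^{n-j+1}$,

$$\mathbb{E}\big|\mathbb{E}\big(v(Y_{n,j},\dots,Y_{n,n})|\mathcal{F}_i\big)-\mathbb{E}v(Y_{n,j},\dots,Y_{n,n})\big|\le \alpha(j-i),\quad i<j$$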

The following condition on the individual characteristic functions $\varphi_{n,j}(t)=\mathbb{E}e^{itY_{n,j}}$,
together with the above assumptions, will assure convergence of the sums

$$S_n=\sum_{j=1}^n Y_{n,j}$$

to the compound Poisson law (hereafter we shall abbreviate $\dot\varphi(t)=\frac{d}{dt}\varphi(t)$, etc.):

(A4) There exists a characteristic function $\varphi(t)$ and positive constants $C_2$ and $\mu$ such
that

$$\big|\dot\varphi_{n,j}(t)-n^{-1}\mu\dot\varphi(t)\big|\le C_2n^{-2},\quad t\in\mathbb{R}.$$

Note that the mixing in (A2) and (A3) can be arbitrarily weak. Further assumptions on
the rate of convergence of $\alpha(k)$ to zero, such as:

(A5) $\alpha(k)\le C_3r^k$ for some $r\in(0,1)$ and $C_3>0$,

allow one to obtain rates of convergence in an appropriate metric. Below we shall work with
the Lévy distance, defined for a pair of distribution functions $F$ and $G$ by (see e.g. [10])

$$L(G,F)=\inf\{h>0: G(x-h)-h\le F(x)\le G(x+h)+h,\ \forall x\}.$$
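
The definition of the Lévy distance can be explored numerically. The following is a minimal sketch, not part of the original argument: the function name `levy_distance`, the grid limits and the bisection depth are all assumptions made for illustration, and the "for all $x$" condition is only checked on a finite grid, so the value is an approximation.

```python
def levy_distance(F, G, x_min=-5.0, x_max=5.0, steps=10000, iters=40):
    """Approximate the Levy distance L(G, F) by bisection on h.

    F and G are callables returning distribution-function values.
    The condition G(x-h)-h <= F(x) <= G(x+h)+h is tested on a finite
    grid only, so accuracy depends on the grid spacing (an assumption
    of this sketch, not of the definition).
    """
    grid = [x_min + (x_max - x_min) * k / steps for k in range(steps + 1)]

    def admissible(h):
        return all(G(x - h) - h <= F(x) <= G(x + h) + h for x in grid)

    # h = 1 is always admissible because distribution functions take
    # values in [0, 1]; admissibility is monotone in h.
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if admissible(mid):
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    # Point masses at 0 and at 0.3: the distance should be close to 0.3.
    F = lambda x: 1.0 if x >= 0.0 else 0.0
    G = lambda x: 1.0 if x >= 0.3 else 0.0
    print(levy_distance(F, G))
```

For two point masses at distance $a<1$ the infimum equals $a$, which gives a quick sanity check of the sketch.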

Our main result is the following: 

Theorem 1.1. Let $Y_{n,j}$, $j=1,\dots,n$, $n\in\mathbb{N}$ be a triangular array of random variables,
whose rows are adapted to a filtration $(\mathcal{F}_j)$, $j\in\mathbb{N}$ and satisfy the assumptions (A1)-(A4).
Then

$$S_n=\sum_{j=1}^n Y_{n,j}\xrightarrow[n\to\infty]{d} S, \tag{1.1}$$

where $S$ has the compound Poisson distribution, with intensity $\mu$ and i.i.d. jumps with
characteristic function $\varphi(t)$.

Moreover, if the assumption (A5) holds then there is a constant $C>0$, such that for all
$n$ large enough,

$$L\big(\mathcal{L}(S_n),\mathcal{L}(S)\big)\le Cn^{-1/2}\log n, \tag{1.2}$$

where $L\big(\mathcal{L}(S_n),\mathcal{L}(S)\big)$ is the Lévy distance between the distribution functions of $S_n$ and
$S$.

Remark 1.2. Both the constant $C$ and the smallest $n$ for which (1.2) holds can be found
explicitly in terms of the $C_i$'s and $\alpha(\cdot)$ mentioned in the assumptions above. Bounds
on the Lévy distance can also be obtained similarly for e.g. polynomially decreasing $\alpha(\cdot)$, by
replacing $b\log n$ with $n^\delta$ for some $\delta>0$ in the proof of Theorem 1.1 and optimizing the
right hand side of the corresponding inequality, analogous to (3.5) below.

In application to threshold estimation, $Y_{n,j}$ is derived from an autoregressive stationary
process $X_j$, generated by the recursion

$$X_j=h(X_{j-1})+\varepsilon_j,\quad j\ge 1, \tag{1.3}$$

where $h(\cdot)$ is a given measurable function and $(\varepsilon_j)$ is a sequence of i.i.d. random variables
with continuous positive probability density $q(\cdot)$. As explained in Section 2, in this context

$$Y_{n,j}:=f(\varepsilon_j)\mathbf{1}_{\{X_{j-1}\in B_n\}},$$

where $B_n:=[0,1/n]$, $f(\cdot)$ is a measurable function and

$$S_n=\sum_{j=1}^n f(\varepsilon_j)\mathbf{1}_{\{X_{j-1}\in B_n\}}.$$

Theorem 1.1 implies that under appropriate conditions, $S_n$ converges weakly to the
compound Poisson random variable with i.i.d. jumps, distributed as $f(\varepsilon_1)$, and the intensity
$\mu:=p(0)$, where $p(\cdot)$ is the unique invariant density of $(X_j)$.
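
The convergence described above can be illustrated by simulation. The sketch below is only an illustration under assumptions not made in the text: it takes the linear map $h(x)=ax$ with $a=0.5$, standard Gaussian noise and $f(x)=x$, for which the stationary law is $N(0,1/(1-a^2))$ and hence $p(0)=\sqrt{(1-a^2)/(2\pi)}$. The function names are hypothetical.

```python
import math
import random


def simulate_sn(n, a=0.5, f=lambda e: e, rng=random):
    """One draw of (S_n, number of visits to B_n) for X_j = a*X_{j-1} + eps_j.

    Assumptions of this sketch: N(0,1) noise, linear h, stationary start.
    """
    x = rng.gauss(0.0, 1.0 / math.sqrt(1.0 - a * a))  # stationary X_0
    s, hits = 0.0, 0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        if 0.0 <= x <= 1.0 / n:   # X_{j-1} falls in B_n = [0, 1/n]
            s += f(eps)           # contributes the jump f(eps_j)
            hits += 1
        x = a * x + eps           # X_j = h(X_{j-1}) + eps_j
    return s, hits


def invariant_density_at_zero(a=0.5):
    """p(0) for the stationary law N(0, 1/(1 - a^2)) of the Gaussian AR(1)."""
    return math.sqrt((1.0 - a * a) / (2.0 * math.pi))


if __name__ == "__main__":
    rng = random.Random(1)
    reps, n = 2000, 200
    draws = [simulate_sn(n, rng=rng) for _ in range(reps)]
    mean_hits = sum(h for _, h in draws) / reps
    print("mean number of jumps:", mean_hits)
    print("p(0):", invariant_density_at_zero())
```

The average number of jumps per sample should be close to the limiting intensity $p(0)$; comparing the empirical distribution of the simulated $S_n$ with a compound Poisson sample would be the natural next step.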

Somewhat surprisingly, we were not able to find in the literature a general result, from 
which this limit could be deduced. In this regard, one naturally thinks of Stein's method 
or martingale convergence results. Stein's method appears to be particularly well suited 
to the compound Poisson distribution with integer valued jumps (see e.g. [3], [2]). The
results such as [1], [181 IS]) [S] or [17] come close, but apparently do not quite fit our 
setting. 

In the particular case when $\mathbb{E}f(\varepsilon_1)=0$, $S_n$ becomes a sum over the array of martingale
differences $Y_{n,j}:=f(\varepsilon_j)\mathbf{1}_{\{X_{j-1}\in B_n\}}$, $j=1,\dots,n$ with the quadratic variation sequence

$$V_{n,m}=\sum_{j=1}^m \mathbf{1}_{\{X_{j-1}\in B_n\}}\mathbb{E}f^2(\varepsilon_1),\quad m=1,\dots,n.$$

A typical martingale limit result such as e.g. [6] or Theorem 2.27 Ch. VIII §2c in [13]
requires that $V_{n,n}$ converges in probability. However, in our case $V_{n,n}$ converges only in
distribution (to a Poisson random variable), but not in probability (since e.g. $V_{n,n}$ is
uniformly integrable, but is not a Cauchy sequence in $L_1$). It is known that $S_n$ may have
a different limit or no limit at all, if the convergence in probability of the quadratic variation
is replaced with convergence in distribution (see [1] and the references therein), so that
the martingale results also do not appear applicable.¹

The objective of this paper is to give a proof of Theorem 1.1 using Tihomirov's method
from [20]. Originally applied to the CLT in the dependent case, it turns out to be remarkably
suitable to the setting under consideration. Before proceeding to the proof in Section 3,
we shall discuss in more detail the application in which the aforementioned convergence
arises.

2. Application to threshold estimation 

Suppose one observes a sample $X^n=(X_1,\dots,X_n)$ from a threshold autoregressive
(TAR) time series, generated by the recursion

$$X_j=g_+(X_{j-1})\mathbf{1}_{\{X_{j-1}>\theta\}}+g_-(X_{j-1})\mathbf{1}_{\{X_{j-1}\le\theta\}}+\varepsilon_j,\quad j\in\mathbb{N}, \tag{2.1}$$

where $g_+(\cdot)$ and $g_-(\cdot)$ are known functions and $(\varepsilon_j)$ is a sequence of i.i.d. random variables
with known probability density $q(\cdot)$. The unknown threshold parameter $\theta$, taking values
in an open interval $\Theta:=(a,b)\subset\mathbb{R}$, is to be estimated from the sample $X^n$. TAR models,
such as (2.1), have been the subject of extensive research in statistics and econometrics
(see e.g. [21] and the recent surveys [22], [11], [8]).

¹ In this connection, it is interesting to note that in the analogous continuous time setting, the quadratic
variation does converge in probability, essentially due to the continuity of the sample paths, see [14].

From the statistical analysis point of view, this estimation problem classifies as
"singular", since the corresponding likelihood function

$$L_n(X^n;\theta)=\prod_{j=1}^n q\Big(X_j-g_+(X_{j-1})\mathbf{1}_{\{X_{j-1}>\theta\}}-g_-(X_{j-1})\mathbf{1}_{\{X_{j-1}\le\theta\}}\Big) \tag{2.2}$$

is discontinuous in $\theta$. Typically in such problems, the sequence of the Bayes estimators

$$\tilde\theta_n=\frac{\int_\Theta \theta L_n(X^n;\theta)\pi(\theta)\,d\theta}{\int_\Theta L_n(X^n;\theta)\pi(\theta)\,d\theta},\quad n\ge 1,$$

is asymptotically efficient in the minimax sense for an arbitrary continuous prior density
$\pi(\cdot)$ (see [E]). The asymptotic distribution of these estimators is determined by the
weak limit of the likelihood ratios as follows. Let $\theta_0\in\Theta$ be the true unknown value
of the parameter and $r_n$ an increasing sequence of numbers. The change of variables
$u=r_n(\theta-\theta_0)\in r_n(\Theta-\theta_0)=:U_n$ gives

$$r_n(\tilde\theta_n-\theta_0)=\frac{\int_{U_n}uZ_n(u)\pi(\theta_0+u/r_n)\,du}{\int_{U_n}Z_n(u)\pi(\theta_0+u/r_n)\,du},$$

where $Z_n(u)$, $n\ge 1$ are the rescaled likelihood ratios

$$Z_n(u)=\frac{L_n(X^n;\theta_0+u/r_n)}{L_n(X^n;\theta_0)}.$$

If $r_n$ can be chosen so that $Z_n(u)$, $u\in\mathbb{R}$ converges weakly to a random process $Z(u)$,
$u\in\mathbb{R}$ in an appropriate topology, then

$$r_n(\tilde\theta_n-\theta_0)\xrightarrow[n\to\infty]{d}\frac{\int_{\mathbb{R}}uZ(u)\,du}{\int_{\mathbb{R}}Z(u)\,du} \tag{2.3}$$

holds (a comprehensive account of this approach can be found in [12]).

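For the likelihoods as in (2.2), a simple calculation (see eq. (4) in [9]) reveals that

$$\log Z_n(u)=\sum_{j=1}^n\log\frac{q\big(\varepsilon_j+\delta(X_{j-1})\big)}{q(\varepsilon_j)}\mathbf{1}_{\{X_{j-1}\in B_n\}},\quad u\ge 0, \tag{2.4}$$

where $B_n:=[\theta_0,\theta_0+u/n]$ and $\delta(x):=g_+(x)-g_-(x)$, and a similar expression is obtained
for $u<0$. It can be shown that (2.3) indeed holds with $r_n:=n$, if $(X_j)$ is sufficiently fast
mixing with the unique invariant probability density $p(x;\theta_0)$, and the sequence $\log Z_n(u)$
converges weakly to the compound Poisson process

$$\log Z(u):=\begin{cases}\displaystyle\sum_{j=1}^{\Pi^+(u)}\log\frac{q\big(\varepsilon_j^++\delta(\theta_0)\big)}{q(\varepsilon_j^+)}, & u\ge 0,\\[2ex] \displaystyle\sum_{j=1}^{\Pi^-(-u)}\log\frac{q\big(\varepsilon_j^--\delta(\theta_0)\big)}{q(\varepsilon_j^-)}, & u<0,\end{cases} \tag{2.5}$$

where $(\varepsilon_j^\pm)$ are i.i.d. copies of $\varepsilon_1$ and $\Pi^+(u)$ and $\Pi^-(u)$ are independent Poisson processes
with the same intensity $p(\theta_0;\theta_0)$.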

The rate $r_n=n$ and the Poisson behavior are typical for discontinuous likelihoods (see
e.g. Ch. 5, [12]). For the linear TAR model, i.e. when $g_\pm(x)=\rho_\pm x$ with constants
$\rho_-\ne\rho_+$, these asymptotics appeared in [7], and the aforementioned generalization is taken
from [9].

One particularly interesting ingredient in the proof, which is the main focus of this
article, is the convergence of the finite dimensional distributions of $Z_n(u)$ to those of
$Z(u)$. In its prototypical form, the problem can be restated as follows. Consider the
stationary Markov sequence $(X_j)$, generated by the recursion (1.3) and let (cf. (2.4))

$$S_n:=\sum_{j=1}^n f(\varepsilon_j)\mathbf{1}_{\{X_{j-1}\in B_n\}}, \tag{2.6}$$

where $B_n:=[0,1/n]$ and $f(\cdot)$ is a measurable function. It is required to show that the
sums $(S_n)$ converge weakly to the compound Poisson random variable with i.i.d. jumps,
distributed as $f(\varepsilon_1)$, and the intensity $p(0)$, where $p(\cdot)$ is the unique invariant density of
$(X_j)$.

This convergence is not hard to prove using the blocks technique: Sn is partitioned into, 
say, n^/^ blocks of n^/^ consecutive summands, n^/^ of which are discarded. Removing 
total of n^/^ • n^/^ out of n terms in the sum does not alter its limit, but the residual 
blocks become nearly independent, if the mixing is fast enough. Moreover, a single event 
{Xj G Bn} occurs within each block with probability of order and hence the sum over 

approximately independent n^/^ blocks yields the claimed compound Poisson behavior. 
This approach dates back to at least [15] in the Poisson case, and the details for the 
compound Poisson setting can be found in [9].

An alternative proof now can be given by applying Theorem 1.1.

Corollary 2.1. Let $(X_j)$ be defined by (1.3) and $S_n$ by (2.6). Assume that

(i) $\varepsilon_1$ has positive Lipschitz continuous bounded probability density $q(x)$, $x\in\mathbb{R}$ with
the finite first absolute moment $\int_{\mathbb{R}}|x|q(x)\,dx<\infty$

(ii) for some $r\in(0,1)$ and $C>0$,

$$|h(x)|\le r|x|,\quad \forall\,|x|\ge C$$

(iii) $\mathbb{E}|f(\varepsilon_1)|<\infty$ and for some constant $C'$,

$$\sup_{z,x\in[0,n^{-1}]}|f(z-h(x))|\le C'$$

for all $n$ large enough.

Then the Markov process $(X_j)$ has a unique invariant density $p(x)$, $x\in\mathbb{R}$, which is
positive, Lipschitz continuous and bounded; for stationary $(X_j)$, the sums $(S_n)$ converge
weakly to the compound Poisson random variable with intensity $p(0)$ and i.i.d. jumps with
the same distribution as $f(\varepsilon_1)$.

Remark 2.2. Corollary 2.1 verifies the weak convergence of the one-dimensional
distributions of the processes $\log Z_n(u)$ from (2.4) to those of $\log Z(u)$, $u\in\mathbb{R}$ defined in (2.5).
The convergence of finite dimensional distributions of higher orders can be treated along
the same lines. The limit (2.3) then follows from the tightness of the sequence of processes
$\log Z_n(u)$ (see [9] for further details).

Remark 2.3. The assumption (iii) holds if e.g. $f(\cdot)$ and $h(\cdot)$ are continuous at 0.

Proof. Under the assumptions (i) and (ii) the standard ergodic theory of Markov chains
(see e.g. Theorem 16.0.2 in [16]) implies that $(X_j)$ is an irreducible, aperiodic and positive
recurrent Markov chain with a unique invariant measure. Due to the additive structure of
the recursion (1.3), the invariant measure has density $p(\cdot)$, which is positive and continuous
with the same Lipschitz constant $L_q$ as the density $q(\cdot)$, and $\|p\|_\infty\le\|q\|_\infty:=\sup_{x\in\mathbb{R}}q(x)$.
Moreover, $(X_j)$ is geometrically mixing, i.e. there exist positive constants $R$ and $\rho<1$,
such that for any measurable function $|g(x)|\le 1$

$$\mathbb{E}\Big|\mathbb{E}\big(g(X_j)|\mathcal{F}_i\big)-\int_{\mathbb{R}}g(x)p(x)\,dx\Big|\le R\rho^{\,j-i},\quad j>i, \tag{2.7}$$

where $\mathcal{F}_j=\sigma\{\varepsilon_i, i\le j\}$. Define $Y_{n,j}:=f(\varepsilon_j)\mathbf{1}_{\{X_{j-1}\in B_n\}}$, then

$$\mathbb{E}|Y_{n,j}|=\mathbb{E}|f(\varepsilon_j)|\,\mathbb{P}(X_{j-1}\in B_n)=\mathbb{E}|f(\varepsilon_1)|\int_0^{1/n}p(x)\,dx\le\mathbb{E}|f(\varepsilon_1)|\,\|q\|_\infty n^{-1}$$

and similarly

$$\mathbb{P}(Y_{n,j}\ne 0)\le\mathbb{P}(X_{j-1}\in B_n)\le\|q\|_\infty n^{-1}.$$

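Further, for $i<j-1$,

$$\begin{aligned}\mathbb{E}|Y_{n,i}|\mathbf{1}_{\{Y_{n,j}\ne 0\}}&=\mathbb{E}\Big(|f(\varepsilon_i)|\mathbf{1}_{\{X_{i-1}\in B_n\}}\mathbb{P}(Y_{n,j}\ne 0|\mathcal{F}_i)\Big)\\&\le\mathbb{E}\Big(|f(\varepsilon_i)|\mathbf{1}_{\{X_{i-1}\in B_n\}}\mathbb{P}(X_{j-1}\in B_n|\mathcal{F}_i)\Big)\\&\le\mathbb{E}|f(\varepsilon_i)|\mathbf{1}_{\{X_{i-1}\in B_n\}}\|q\|_\infty n^{-1}\le\mathbb{E}|f(\varepsilon_1)|\,\|q\|_\infty^2n^{-2}\end{aligned}$$

and

$$\mathbb{E}\mathbf{1}_{\{Y_{n,i}\ne 0\}}|Y_{n,j}|\le\mathbb{E}\mathbf{1}_{\{X_{i-1}\in B_n\}}\mathbb{E}\big(|f(\varepsilon_j)|\mathbf{1}_{\{X_{j-1}\in B_n\}}|\mathcal{F}_i\big)\le\mathbb{E}|f(\varepsilon_1)|\,\|q\|_\infty^2n^{-2}.$$

Similarly,

$$\begin{aligned}\mathbb{E}|Y_{n,j-1}|\mathbf{1}_{\{Y_{n,j}\ne 0\}}&\le\mathbb{E}\mathbf{1}_{\{X_{j-2}\in B_n\}}|f(\varepsilon_{j-1})|\mathbf{1}_{\{X_{j-1}\in B_n\}}\\&=\mathbb{E}\mathbf{1}_{\{X_{j-2}\in B_n\}}\mathbb{E}\big(|f(\varepsilon_{j-1})|\mathbf{1}_{\{h(X_{j-2})+\varepsilon_{j-1}\in B_n\}}|\mathcal{F}_{j-2}\big)\\&=\mathbb{E}\mathbf{1}_{\{X_{j-2}\in B_n\}}\int_{\mathbb{R}}|f(y)|\mathbf{1}_{\{h(X_{j-2})+y\in B_n\}}q(y)\,dy\\&=\mathbb{E}\mathbf{1}_{\{X_{j-2}\in B_n\}}\int_0^{1/n}|f(z-h(X_{j-2}))|\,q(z-h(X_{j-2}))\,dz\\&\le\|q\|_\infty\mathbb{E}\mathbf{1}_{\{X_{j-2}\in B_n\}}n^{-1}\sup_{z,x\in[0,n^{-1}]}|f(z-h(x))|\le\|q\|_\infty^2n^{-2}C'\end{aligned}$$

and

$$\mathbb{E}\mathbf{1}_{\{Y_{n,j-1}\ne 0\}}|Y_{n,j}|\le\mathbb{E}\mathbf{1}_{\{X_{j-2}\in B_n\}}|f(\varepsilon_j)|\mathbf{1}_{\{X_{j-1}\in B_n\}}\le\mathbb{E}|f(\varepsilon_1)|\,\|q\|_\infty^2n^{-2}.$$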
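
Hence (A1) is satisfied for all $n$ large enough with

$$C_1:=\big(\|q\|_\infty\vee 1\big)\big(\mathbb{E}|f(\varepsilon_1)|\vee C'\vee 1\big).$$

Further, by the Markov property,

$$\mathbb{E}(Y_{n,j}|\mathcal{F}_{j-2})=\mathbb{E}\Big(\mathbf{1}_{\{X_{j-1}\in B_n\}}\mathbb{E}\big(f(\varepsilon_j)|\mathcal{F}_{j-1}\big)\Big|\mathcal{F}_{j-2}\Big)=\mathbb{E}f(\varepsilon_1)\,\mathbb{P}(X_{j-1}\in B_n|X_{j-2})=:H(X_{j-2})$$

and

$$|H(x)|\le|\mathbb{E}f(\varepsilon_1)|\,\|q\|_\infty n^{-1}\le C_1n^{-1}.$$

Hence by (2.7), for $i<j-1$,

$$\begin{aligned}\mathbb{E}\big|\mathbb{E}(Y_{n,j}|\mathcal{F}_i)-\mathbb{E}Y_{n,j}\big|&=\mathbb{E}\Big|\mathbb{E}\big(\mathbb{E}(Y_{n,j}|\mathcal{F}_{j-2})|\mathcal{F}_i\big)-\mathbb{E}\,\mathbb{E}(Y_{n,j}|\mathcal{F}_{j-2})\Big|\\&=\mathbb{E}\big|\mathbb{E}(H(X_{j-2})|\mathcal{F}_i)-\mathbb{E}H(X_{j-2})\big|\le C_1n^{-1}R\rho^{\,j-i-2},\end{aligned}$$

and (A2) holds with $\ell=2$ and

$$\alpha(k):=R\rho^{\,k-2}. \tag{2.8}$$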
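The assumption (A3) is checked similarly. Finally,

$$\dot\varphi_{n,j}(t)=\frac{d}{dt}\mathbb{E}e^{itf(\varepsilon_j)\mathbf{1}_{\{X_{j-1}\in B_n\}}}=\mathbb{E}\,if(\varepsilon_j)e^{itf(\varepsilon_j)}\mathbf{1}_{\{X_{j-1}\in B_n\}}=\dot\varphi(t)\int_0^{1/n}p(x)\,dx,$$

where $\varphi(t)=\mathbb{E}e^{itf(\varepsilon_1)}$ and interchanging the derivative and the expectation is valid by
dominated convergence and (iii).

Since the invariant density is Lipschitz, it follows that

$$\Big|\dot\varphi_{n,j}(t)-p(0)\frac{1}{n}\dot\varphi(t)\Big|\le|\dot\varphi(t)|\int_0^{1/n}|p(x)-p(0)|\,dx\le\tfrac{1}{2}L_q\,\mathbb{E}|f(\varepsilon_1)|\,n^{-2},$$

which verifies (A4), and the claim now follows from Theorem 1.1. In fact, the assumption
(A5) holds by virtue of (2.8) and the Lévy distance to the limit distribution converges at
the rate claimed in (1.2). □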

3. Proof of Theorem 1.1

Tihomirov's approach [20] is applicable when the characteristic function of the limit
distribution uniquely solves an ordinary differential equation. Roughly, the idea is then
to show that the characteristic functions of the prelimit distributions satisfy the same
equation in the limit.

The characteristic function of the compound Poisson distribution with intensity $\mu$ and
characteristic function of the jumps $\varphi(t)$ is given by

$$\psi(t)=e^{\mu(\varphi(t)-1)},$$

which solves uniquely the initial value problem

$$\dot\psi(t)=\mu\dot\varphi(t)\psi(t),\quad\psi(0)=1,\quad t\in\mathbb{R}.$$
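
The initial value problem can be checked numerically for a concrete jump law. The sketch below is an illustration only: it assumes exponentially distributed jumps with unit rate, so that $\varphi(t)=1/(1-it)$, and the function names and the finite-difference step are hypothetical choices.

```python
import cmath


def phi(t):
    """Characteristic function of Exp(1) jumps (an assumed example)."""
    return 1.0 / (1.0 - 1j * t)


def phi_dot(t):
    """Derivative of phi: d/dt (1 - it)^(-1) = i (1 - it)^(-2)."""
    return 1j / (1.0 - 1j * t) ** 2


def psi(t, mu=2.0):
    """Compound Poisson characteristic function exp(mu * (phi(t) - 1))."""
    return cmath.exp(mu * (phi(t) - 1.0))


def ode_residual(t, mu=2.0, h=1e-5):
    """|psi'(t) - mu * phi'(t) * psi(t)|, with psi' from a central difference."""
    psi_dot = (psi(t + h, mu) - psi(t - h, mu)) / (2.0 * h)
    return abs(psi_dot - mu * phi_dot(t) * psi(t, mu))


if __name__ == "__main__":
    for t in (-3.0, -1.0, 0.0, 0.5, 2.0):
        print(t, ode_residual(t))
```

The residuals should be at the level of the finite-difference error, consistent with the compound Poisson characteristic function solving the equation with initial value 1 at $t=0$.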

Since $\mathbb{E}|S_n|<\infty$, the characteristic function $\psi_n(t):=\mathbb{E}e^{itS_n}$ is continuously
differentiable and $\Delta_n(t):=\psi(t)-\psi_n(t)$ satisfies

$$\dot\Delta_n(t)=\mu\dot\varphi(t)\Delta_n(t)+r_n(t),\quad t\in\mathbb{R},$$

subject to $\Delta_n(0)=0$, where $r_n(t):=\mu\dot\varphi(t)\psi_n(t)-\dot\psi_n(t)$. Solving for $\Delta_n(t)$ gives

$$\Delta_n(t)=\int_0^t\exp\Big(\mu\big(\varphi(t)-\varphi(s)\big)\Big)r_n(s)\,ds,\quad t\ge 0. \tag{3.1}$$

As we show below, for any constant $b>0$ such that $b\log n$ is a positive integer,

$$|r_n(t)|\le C_2n^{-1}+3C_1\alpha(b\log n)+8C_1^2\frac{b\log n}{n},\quad t\ge 0, \tag{3.2}$$

and, since $|\varphi(t)|\le 1$, it follows from (3.1) that

$$|\Delta_n(t)|\le e^{2\mu}\int_0^t|r_n(s)|\,ds\le e^{2\mu}\Big(C_2n^{-1}+3C_1\alpha(b\log n)+8C_1^2\frac{b\log n}{n}\Big)t,\quad t\ge 0. \tag{3.3}$$

A similar bound holds for $t<0$ and the claimed weak limit (1.1) follows once we check
(3.2). To this end, we have


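$$\begin{aligned}\dot\psi_n(t)&:=\frac{d}{dt}\mathbb{E}e^{itS_n}=\sum_{k=1}^n\mathbb{E}\,iY_{n,k}\exp\Big(it\sum_{j=1}^nY_{n,j}\Big)=\sum_{k=1}^n\mathbb{E}\,iY_{n,k}e^{itY_{n,k}}\exp\Big(it\sum_{j\ne k}Y_{n,j}\Big)\\&=\sum_{k=1}^n\mathbb{E}\,iY_{n,k}e^{itY_{n,k}}\exp\Big(it\sum_{|j-k|>b\log n}Y_{n,j}\Big)\\&\quad+\sum_{k=1}^n\mathbb{E}\,iY_{n,k}e^{itY_{n,k}}\Big(\exp\Big(it\sum_{j\ne k}Y_{n,j}\Big)-\exp\Big(it\sum_{|j-k|>b\log n}Y_{n,j}\Big)\Big)=:J_1+J_2,\end{aligned}$$

where we used $\mathbb{E}|S_n|<\infty$ and dominated convergence to interchange the derivative
and the expectation. Note that $|e^{ix}-e^{i(x+y)}|\le 2\cdot\mathbf{1}_{\{y\ne 0\}}$ for $x,y\in\mathbb{R}$, and hence by
the assumption (A1)

$$\begin{aligned}|J_2|&\le\sum_{k=1}^n\mathbb{E}|Y_{n,k}|\Big|\exp\Big(it\sum_{j\ne k}Y_{n,j}\Big)-\exp\Big(it\sum_{|j-k|>b\log n}Y_{n,j}\Big)\Big|\\&\le 2\sum_{k=1}^n\sum_{|j-k|\le b\log n,\ j\ne k}\mathbb{E}|Y_{n,k}|\mathbf{1}_{\{Y_{n,j}\ne 0\}}\le 4C_1^2\frac{b\log n}{n}.\end{aligned} \tag{3.4}$$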

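Further, by the triangle inequality

$$\begin{aligned}&\Big|\mathbb{E}\,iY_{n,k}e^{itY_{n,k}}\exp\Big(it\sum_{|j-k|>b\log n}Y_{n,j}\Big)-\mathbb{E}\,iY_{n,k}e^{itY_{n,k}}\,\psi_n(t)\Big|\\&\quad\le\Big|\mathbb{E}\,iY_{n,k}e^{itY_{n,k}}\exp\Big(it\sum_{|j-k|>b\log n}Y_{n,j}\Big)-\mathbb{E}\,iY_{n,k}e^{itY_{n,k}}\,\mathbb{E}\exp\Big(it\sum_{|j-k|>b\log n}Y_{n,j}\Big)\Big|\\&\qquad+\Big|\mathbb{E}\,iY_{n,k}e^{itY_{n,k}}\Big(\mathbb{E}\exp\Big(it\sum_{|j-k|>b\log n}Y_{n,j}\Big)-\mathbb{E}\exp\Big(it\sum_{j=1}^nY_{n,j}\Big)\Big)\Big|=:J_3+J_4.\end{aligned}$$

Similarly to (3.4), we have

$$\begin{aligned}|J_4|&\le\mathbb{E}|Y_{n,k}|\,\mathbb{E}\Big|\exp\Big(it\sum_{|j-k|>b\log n}Y_{n,j}\Big)-\exp\Big(it\sum_{j=1}^nY_{n,j}\Big)\Big|\\&\le 2\,\mathbb{E}|Y_{n,k}|\,\mathbb{E}\mathbf{1}_{\{\sum_{|j-k|\le b\log n}Y_{n,j}\ne 0\}}\le 2\,\mathbb{E}|Y_{n,k}|\sum_{|j-k|\le b\log n}\mathbb{P}(Y_{n,j}\ne 0)\le 4C_1^2\frac{b\log n}{n^2}.\end{aligned}$$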



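For brevity, define

$$U:=\exp\Big(it\sum_{j<k-b\log n}Y_{n,j}\Big),\quad V:=Y_{n,k}e^{itY_{n,k}},\quad W:=\exp\Big(it\sum_{j>k+b\log n}Y_{n,j}\Big).$$

By the triangle inequality,

$$|\mathbb{E}UVW-\mathbb{E}V\mathbb{E}UW|\le|\mathbb{E}UVW-\mathbb{E}UV\mathbb{E}W|+|\mathbb{E}UV\mathbb{E}W-\mathbb{E}U\mathbb{E}V\mathbb{E}W|+|\mathbb{E}U\mathbb{E}V\mathbb{E}W-\mathbb{E}V\mathbb{E}UW|.$$

Since $U$ and $V$ are $\mathcal{F}_k$-measurable, $|U|\le 1$, $|W|\le 1$ and $\mathbb{E}|V|\le C_1n^{-1}$, (A3) implies

$$|\mathbb{E}UVW-\mathbb{E}UV\mathbb{E}W|\le\mathbb{E}|UV|\,\big|\mathbb{E}(W|\mathcal{F}_k)-\mathbb{E}W\big|\le C_1n^{-1}\alpha(b\log n),$$

and, since $U$ is measurable with respect to $\mathcal{F}_{k-b\log n}$,

$$|\mathbb{E}U\mathbb{E}V\mathbb{E}W-\mathbb{E}V\mathbb{E}UW|\le|\mathbb{E}V|\,\mathbb{E}|U|\,\big|\mathbb{E}W-\mathbb{E}(W|\mathcal{F}_{k-b\log n})\big|\le C_1n^{-1}\alpha(2b\log n).$$

Further, by (A2) for $b\log n\ge\ell$,

$$|\mathbb{E}UV\mathbb{E}W-\mathbb{E}U\mathbb{E}V\mathbb{E}W|\le|\mathbb{E}W|\,\mathbb{E}|U|\,\big|\mathbb{E}(V|\mathcal{F}_{k-b\log n})-\mathbb{E}V\big|\le C_1n^{-1}\alpha(b\log n).$$

Hence, since $\alpha(\cdot)$ is decreasing,

$$|J_3|=|\mathbb{E}UVW-\mathbb{E}V\mathbb{E}UW|\le 3C_1n^{-1}\alpha(b\log n),$$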

and consequently, by (A4),

$$\begin{aligned}|J_1-\mu\dot\varphi(t)\psi_n(t)|&\le\sum_{k=1}^n\Big|\mathbb{E}\,iY_{n,k}e^{itY_{n,k}}\exp\Big(it\sum_{|j-k|>b\log n}Y_{n,j}\Big)-n^{-1}\mu\dot\varphi(t)\psi_n(t)\Big|\\&\le C_2n^{-1}+3C_1\alpha(b\log n)+4C_1^2\frac{b\log n}{n}.\end{aligned}$$

Assembling all parts together, we obtain (3.2).

The bound (1.2) for the Lévy metric is obtained by means of Zolotarev's inequality [23],
which in view of the bound in (3.3) gives

$$L\big(\mathcal{L}(S_n),\mathcal{L}(S)\big)\le\frac{e^{2\mu}}{\pi}\Big(C_2n^{-1}+3C_1\alpha(b\log n)+8C_1^2\frac{b\log n}{n}\Big)T+2e\,\frac{\log T}{T}. \tag{3.5}$$

If $\alpha(k)$ decays geometrically as in (A5), the bound (1.2) is obtained by choosing $T=n^{1/2}$
and $b\ge-\frac{1}{\log r}$. □
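
The effect of this choice of $T$ and $b$ can be seen numerically. The sketch below is an illustration under stated assumptions: all constants ($\mu$, $C_1$, $C_2$, $C_3$) are set to 1 and $r=1/2$ purely for demonstration, $b\log n$ is not rounded to an integer, and the function names are hypothetical. It evaluates a bound of the form $\frac{e^{2\mu}}{\pi}\big(C_2n^{-1}+3C_1\alpha(b\log n)+8C_1^2\,b\log n/n\big)T+2e\log T/T$ with $\alpha(k)=C_3r^k$ and normalizes it by $n^{-1/2}\log n$.

```python
import math


def levy_bound(n, T, b, mu=1.0, r=0.5, C1=1.0, C2=1.0, C3=1.0):
    """Bound on the Levy distance for geometric alpha(k) = C3 * r**k.

    All constants default to illustrative values, not values from the text.
    """
    alpha = C3 * r ** (b * math.log(n))
    bracket = C2 / n + 3.0 * C1 * alpha + 8.0 * C1 ** 2 * b * math.log(n) / n
    return (math.exp(2.0 * mu) / math.pi) * bracket * T \
        + 2.0 * math.e * math.log(T) / T


def normalized_bound(n, r=0.5):
    """Bound with T = sqrt(n), b = -1/log(r), divided by n^(-1/2) * log(n)."""
    b = -1.0 / math.log(r)
    return levy_bound(n, math.sqrt(n), b, r=r) / (math.log(n) / math.sqrt(n))


if __name__ == "__main__":
    for n in (1e4, 1e6, 1e8):
        print(int(n), normalized_bound(n))
```

With $b=-1/\log r$ one has $r^{b\log n}=n^{-1}$, so every term in the bracket is of order $n^{-1}\log n$ and the normalized values stay bounded as $n$ grows, in line with the rate $n^{-1/2}\log n$.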

Remark 3.1. The rate in (1.2) is not as sharp as the one obtained by Tihomirov in [20] in
the CLT case. Apparently, the deficiency originates in the specific form of the compound
Poisson characteristic function $\psi(t)=e^{\mu(\varphi(t)-1)}$, which does not vanish as $t\to\infty$. More
specifically, the integration kernel $K(s,t):=e^{\mu(\varphi(t)-\varphi(s))}$ in (3.1) does not decay when $t$
is fixed and $s$ decreases, which contributes the linear growth in $t$ of the right hand side of
(3.3) and the corresponding linear growth in $T$ in (3.5). In the Gaussian case, this kernel
has the form $K(s,t):=e^{-t^2/2+s^2/2}$ (see eq. (3.25) page 809 in [20]), which yields a better
balance between the growth in $t$ and the decrease in $n$. It seems that in the compound Poisson
setting under consideration the rate cannot be essentially improved within the framework
of Tihomirov's method.

Acknowledgement. The authors are grateful to Y. Kutoyants and R. Liptser for the
enlightening discussions on the subject of this article. We also appreciate the referee's
suggestion, which led to the ultimate improvement of the rate in (1.2).